Quantitative Structure-Activity Relationship Study for Inhibitors of HIV-1 Integrase LEDGF/p75 Interaction

Li, Y.; Tian, Y.J.; Xi, Y.; Qin, Z.J.; Yan, A.X.*
Curr Comput-Aided Drug Des, 2020, 16(5):654-66.


    Computational quantitative structure-activity relationship (QSAR) models were developed to predict the bioactivity of HIV-1 integrase LEDGF/p75 inhibitors. A total of 190 inhibitors with measured bioactivities were collected, and each inhibitor was represented by 20 selected CORINA Symphony descriptors. The 190 inhibitors were split into a training set and a test set using either a Kohonen self-organizing map (SOM) or a random method. Multiple linear regression (MLR) models, support vector machine (SVM) models and two consensus models were built on the training sets. All models predicted pIC50 well, with correlation coefficients (r) above 0.7 on the test sets. Consensus Model C1, which performed better than the other models, reached r = 0.909 on the training set and r = 0.804 on the test set. Analysis of the selected molecular descriptors shows that hydrogen bond acceptors, atomic charges and electronegativities (especially for π atoms) are important in predicting the activity of HIV-1 integrase LEDGF/p75-IN inhibitors.
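As a rough illustration of the data preparation described above, the following sketch converts an IC50 value to pIC50 and draws a random training/test split with the sizes reported for the random-split models (127 and 63). The function names, the seed, and the assumption that IC50 values are given in nanomolar are illustrative choices and are not taken from the paper.

```python
import math
import random


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 mol/L, so -log10(ic50_nm * 1e-9) = 9 - log10(ic50_nm)
    return 9.0 - math.log10(ic50_nm)


def random_split(n_items: int, n_train: int, seed: int = 0):
    """Return (train_idx, test_idx) from a seeded shuffle of range(n_items)."""
    if not 0 < n_train < n_items:
        raise ValueError("n_train must be between 1 and n_items - 1")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # the seed is an arbitrary assumption
    return sorted(indices[:n_train]), sorted(indices[n_train:])


# 190 compounds split into 127 training and 63 test compounds
train_idx, test_idx = random_split(190, 127)
print(len(train_idx), len(test_idx))

# An IC50 of 1000 nM (1 µM) corresponds to a pIC50 of about 6
print(pic50_from_nm(1000.0))
```

A SOM-based split would instead pick training and test compounds from each occupied neuron of the map, so that both sets cover the same regions of descriptor space; that step is not sketched here.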


QSAR model performance: dataset of 190 HIV-1 integrase LEDGF/p75 inhibitors

| Model Name | Algorithm | Descriptors | Splitting method | Training set size | Training set r | Training set RMSE | Training set MAE | Test set size | Test set r | Test set RMSE | Test set MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Model A1 | MLR | 20 CORINA Symphony descriptors | Kohonen's self-organizing map (SOM) | 135 | 0.8403 | 0.0125 | 0.0004 | 55 | 0.7262 | 0.0262 | 0.0094 |
| Model A2 | MLR | 20 CORINA Symphony descriptors | Random | 127 | 0.8331 | 0.0146 | 0.0097 | 63 | 0.7750 | 0.0161 | 0.0034 |
| Model B1 | SVM | 20 CORINA Symphony descriptors | Kohonen's self-organizing map (SOM) | 135 | 0.9244 | 0.0039 | 0.0010 | 55 | 0.7772 | 0.0144 | 0.0055 |
| Model B2 | SVM | 20 CORINA Symphony descriptors | Random | 127 | 0.8399 | 0.0125 | 0.0049 | 63 | 0.7549 | 0.0204 | 0.0015 |
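For readers who want to compute the same kinds of statistics on their own predictions, here is a minimal sketch of the three metrics named in the table: the Pearson correlation coefficient r, RMSE and MAE. The example values are made up for illustration and are not data from the study; the page does not state how its reported error values were scaled.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed and predicted pIC50 values (not from the paper)
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]

print(round(pearson_r(observed, predicted), 4))  # 0.8944
print(rmse(observed, predicted))                 # 0.5
print(mae(observed, predicted))                  # 0.5
```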

Consensus models: dataset of 190 HIV-1 integrase LEDGF/p75 inhibitors

| Model Name | Splitting method | Training set size | Training set r | Training set RMSE | Training set MAE | Test set size | Test set r | Test set RMSE | Test set MAE |
|---|---|---|---|---|---|---|---|---|---|
| Model C1 | Kohonen's self-organizing map (SOM) | 135 | 0.9094 | 0.0051 | 0.0003 | 55 | 0.8043 | 0.0113 | 0.0074 |
| Model C2 | Random | 127 | 0.857 | 0.0109 | 0.0073 | 63 | 0.7948 | 0.0134 | 0.0024 |
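This page does not state how the consensus models combine the individual models. One common choice is to average the per-compound predictions of the MLR and SVM models built on the same split. The sketch below illustrates that assumption only; the function name and the example values are hypothetical.

```python
def consensus_predict(*model_predictions):
    """Average per-compound predictions from two or more models.

    Assumes an unweighted arithmetic mean; the paper may use a different rule.
    """
    if not model_predictions:
        raise ValueError("at least one list of predictions is required")
    n = len(model_predictions[0])
    if any(len(preds) != n for preds in model_predictions):
        raise ValueError("all models must predict the same compounds")
    return [sum(values) / len(values) for values in zip(*model_predictions)]


# Hypothetical pIC50 predictions for three compounds (not from the paper)
mlr_pred = [5.25, 6.5, 7.0]
svm_pred = [5.75, 6.0, 7.5]

print(consensus_predict(mlr_pred, svm_pred))  # [5.5, 6.25, 7.25]
```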

Main project members

Li Yang (李杨), PhD student

Tian Yujia (田钰嘉), PhD student

1204429112@qq.com